A hybrid approach to electrolaryngeal speech enhancement based on spectral subtraction and statistical voice conversion

Authors

Kou Tanaka

Tomoki Toda

Graham Neubig

Sakriani Sakti

Satoshi Nakamura

Abstract

We present a hybrid approach to improving the naturalness of electrolaryngeal (EL) speech while minimizing degradation in intelligibility. An electrolarynx is a device that artificially generates excitation sounds to enable laryngectomees to produce EL speech. Although proficient laryngectomees can produce quite intelligible EL speech, it sounds very unnatural due to the mechanical excitation produced by the device. Moreover, the excitation sounds produced by the device often leak outside, adding noise to EL speech. To address these issues, previous work has proposed methods for EL speech enhancement through either noise reduction or voice conversion. The former usually causes no degradation in intelligibility but yields only small improvements in naturalness, as the mechanical excitation sounds remain essentially unchanged. The latter, on the other hand, significantly improves the naturalness of EL speech by using spectral and excitation parameters of natural voices converted from the acoustic parameters of EL speech, but it usually degrades intelligibility owing to conversion errors. We propose a hybrid method that uses noise reduction to enhance the spectral parameters and voice conversion to predict the excitation parameters. The experimental results demonstrate that the proposed method yields significant improvements in naturalness compared with EL speech while keeping intelligibility sufficiently high.
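The noise-reduction half of the hybrid can be illustrated with a minimal sketch of magnitude-domain spectral subtraction. This is not the authors' implementation: the function name, the over-subtraction factor `alpha`, the spectral `floor`, and the assumption that the leaked excitation noise spectrum has been estimated in advance are all illustrative choices.

```python
import numpy as np

def spectral_subtraction(noisy_mag, noise_mag, alpha=1.0, floor=0.01):
    """Subtract an estimated noise magnitude spectrum from every frame.

    noisy_mag: (frames, bins) magnitude spectrogram of the noisy EL speech.
    noise_mag: (bins,) average magnitude spectrum of the leaked noise
               (assumed to be estimated beforehand).
    alpha:     over-subtraction factor (hypothetical default).
    floor:     spectral floor, as a fraction of the noisy magnitude,
               that keeps the result from going negative.
    """
    noisy_mag = np.asarray(noisy_mag, dtype=float)
    noise_mag = np.asarray(noise_mag, dtype=float)
    cleaned = noisy_mag - alpha * noise_mag          # broadcast over frames
    return np.maximum(cleaned, floor * noisy_mag)    # apply the floor

# Toy example: two frames, two frequency bins.
noisy = np.array([[1.0, 2.0], [0.5, 3.0]])
noise = np.array([0.6, 1.0])
enhanced = spectral_subtraction(noisy, noise)
```

In the hybrid scheme described in the abstract, only the spectral parameters would come from a step like this; the excitation parameters would be predicted separately by voice conversion.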

Similar resources

A Hybrid Approach to Electrolaryngeal Speech Enhancement Based on Noise Reduction and Statistical Excitation Generation

This paper presents an electrolaryngeal (EL) speech enhancement method capable of significantly improving naturalness of EL speech while causing no degradation in its intelligibility. An electrolarynx is an external device that artificially generates excitation sounds to enable laryngectomees to produce EL speech. Although proficient laryngectomees can produce quite intelligible EL speech, it s...

Evaluation of Excitation Feature Prediction in a Hybrid Approach to Electrolaryngeal Speech Enhancement

We implement removing micro-prosody with low-pass filtering and avoiding Unvoiced/Voiced (U/V) prediction as part of a hybrid approach to improve statistical excitation prediction in the hybrid approach to electrolaryngeal (EL) speech enhancement. An electrolarynx is a device that artificially generates excitation sounds to enable laryngectomees to produce EL speech. Although proficient larynge...

Electrolaryngeal speech enhancement based on statistical voice conversion

This paper proposes a speaking-aid system for laryngectomees using GMM-based voice conversion that converts electrolaryngeal speech (EL speech) to normal speech. Because valid F0 information cannot be obtained from the EL speech, we have so far converted the EL speech to whispering. This paper conducts the EL speech conversion to normal speech using F0 contours estimated from the spectral infor...

A digital signal processor implementation of silent/electrolaryngeal speech enhancement based on real-time statistical voice conversion

In this paper, we present a digital signal processor (DSP) implementation of real-time statistical voice conversion (VC) for silent speech enhancement and electrolaryngeal speech enhancement. As a silent speech interface, we focus on nonaudible murmur (NAM), which can be used in situations where audible speech is not acceptable. Electrolaryngeal speech is one of the typical types of alaryngeal ...

Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems

This article aims to examine methods of optimizing GMM-based voice conversion systems performance in which GMM method is introduced as the basic method for improvement of voice conversion systems performance. In the current methods, due to using a single conversion function to convert all speech units and subsequent spectral smoothing arising from statistical averaging, we will observe quality ...

Publication date: 2013
